ABSTRACT

We analyze environmental correlations using mark clustering statistics with the mock 
galaxy catalogue constructed by Muldrew et al. (Paper I). We find that mark corre- 
lation functions are able to detect even a small dependence of galaxy properties on 
the environment, quantified by the overdensity $1+\delta$, while such a small dependence
would be difficult to detect by traditional methods. We then show that rank ordering 
the marks and using the rank as a weight is a simple way of comparing the correlation 
signals for different marks. With this we quantify to what extent fixed-aperture over- 
densities are sensitive to large-scale halo environments, nearest-neighbor overdensities 
are sensitive to small-scale environments within haloes, and colour is a better tracer 
of overdensity than is luminosity. 

Key words: methods: statistical - galaxies: evolution - galaxies: haloes - galaxies: 
clustering - large scale structure of the universe 



1 INTRODUCTION

In hierarchical clustering models, clues about the galaxy formation
process are encoded in correlations between galaxy
properties and their environments. This has motivated mea- 
surements of such correlations. Traditional measures are in- 
tended to allow one to quantify if galaxies in dense regions 
tend to be more luminous, or redder, or older, or tend to 
move faster than average, and so on. These conclusions depend
critically on how the density $N_g/V$ was estimated:
fixed-aperture measurements count the number of galaxies
$N_g$ that are within volume $V$ of an object (i.e., the
numerator of the ratio $N_g/V$ varies from one object to
another), whereas near-neighbour measurements find the $V$
that contains $N_g$ nearest neighbours (i.e., the denominator
is stochastic). Clearly, the size and shape of $V$, or the choice
of $N_g$, matter greatly (the universe is homogeneous on
sufficiently large $N_g$ or $V$). In addition, the choice of
three-dimensional or projected surface density matters as well,
as do the redshift uncertainties and sample selection.
Determining which of the many observed correlations is
fundamental, and which is a consequence of others, can be a subtle
task, especially since the environment is often the least well 
determined of a galaxy's attributes. Moreover, the estimate 
of the environment is often sufficiently complicated that it 
cannot be modelled or interpreted analytically. 

Mark clustering statistics are fundamentally different, 
in the sense that they are, strictly speaking, statements 
about pairs, triples, quadruples, etc., of galaxies, rather than 
about single objects (Stoyan & Stoyan 1994). For example, 
the most commonly used such statistic returns an estimate 
of how the properties of galaxy pairs (rather than of sin- 
gle galaxies) depend on pair separation. (While this is eas- 
ily extended to triples, quadruples, etc., such estimates are 
rarely ever made.) In essence, for each pair separation r, this 
statistic weights each galaxy in a pair by its own attribute 
(e.g., luminosity, colour, etc., expressed in units of the mean 
across the population) and then divides this weighted pair 
count by the unweighted one. Symbolically, one may write 
this statistic as WW(r)/DD(r), where WW and DD stand 
for the weighted and unweighted pair counts at separation 
r. 
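The weighted-to-unweighted pair count ratio described above can be sketched as follows. This is a minimal brute-force illustration, not the code used for the measurements in this paper; the function name `mark_correlation`, the Cartesian positions, and the absence of periodic boundaries or random catalogues are all assumptions made for the example.

```python
import math


def mark_correlation(positions, marks, bin_edges):
    """Brute-force WW(r)/DD(r) in bins of pair separation (illustrative sketch)."""
    mean_mark = sum(marks) / len(marks)
    # Express each mark in units of the population mean.
    weights = [m / mean_mark for m in marks]
    nbins = len(bin_edges) - 1
    ww = [0.0] * nbins  # weighted pair counts
    dd = [0] * nbins    # unweighted pair counts
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])
            for b in range(nbins):
                if bin_edges[b] <= r < bin_edges[b + 1]:
                    ww[b] += weights[i] * weights[j]
                    dd[b] += 1
                    break
    # Bins without pairs have no defined ratio.
    return [ww[b] / dd[b] if dd[b] else float("nan") for b in range(nbins)]
```

If all marks are equal, the ratio is unity in every populated bin; values above unity at small separations indicate that close pairs carry above-average marks.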

Previous estimates of WW/DD have shown that close
pairs of galaxies are more luminous (Beisbart & Kerscher
2000), redder (Skibba et al. 2006), and older (Sheth et al.
2006) than average (WW/DD > 1 for $r$ less than a few
$h^{-1}$ Mpc). While these trends are qualitatively the same
as those returned by traditional estimates, and they are 
also in qualitative agreement with galaxy formation models 
(Sheth 2005), the mark statistics are particularly interest- 
ing because a theoretical framework exists for interpreting 
such measurements quantitatively (Sheth 2005; Skibba et 
al. 2006). On the other hand, this is also a drawback, be- 
cause the theoretical framework is almost required if one 
wishes to draw more than qualitative conclusions from such 
measurements. This is because the magnitude of the (say) 
luminosity-weighted signal changes if one weights instead by
the log of the luminosity (Sheth, Connolly & Skibba 2005; 
Skibba et al. 2006). Since the same physics has led to both 
signals, one would like the measurement to not depend on 
the 'units' in which the measurement was made. 

This dependence on 'units' derives from the fact that 
the magnitude of WW/DD depends on the distribution of
the weights (e.g., its width, the length of its tails, etc.). 
Needless to say, it also complicates efforts to determine if 
one observable correlates more strongly with environment 
than another. As a case in point, it has long been known that 
cluster galaxies tend to be redder than average, but there 
is a wide range in luminosity between the brightest cluster 
galaxy (BCG) and the dwarf satellites in a cluster. Since 
clusters are regions of high density, one naively expects to 
find that colour correlates more strongly with environment 
than does luminosity. However, the mark correlation signal 
appears to have a larger amplitude for luminosity than it 
does for colour (Skibba & Sheth 2009). One of the main 
goals of the present work is to show how to remove this 
effect from the measurement, so that the magnitude of the 
signal can be compared across different weights. 

In the next section, we describe the galaxy catalogues 
used throughout the paper. In Section 3 we demonstrate
that mark correlations are particularly sensitive probes of 
environmental correlations; they correctly show a strong sig- 
nal even when traditional estimates based on how one-point 
statistics vary as a function of environment are unable to 
see one - a fact that was recently exploited by Paranjape & 
Sheth (2012). In Section 4 we show how to remove the dependence
of the mark correlation signal on 'units', arguing that for
any given weight, one should simply rank order and use the 
rank as the weight. We then use these rank-ordered mark 
correlations to compare different mark correlation signals 
with one another. A final section summarizes our findings. 

Throughout this paper we assume a spatially flat cosmology
with $\Omega_m = 0.25$, $\Omega_\Lambda = 0.75$, and $\sigma_8 = 0.9$. We
write the Hubble constant as $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$.



2 DATA 

2.1 Mock Galaxy Catalogue 

To illustrate our methods and to interpret some of our re- 
sults, we use the mock galaxy catalogue of Muldrew et al. 
(2012; hereafter M12). We refer the reader to M12 for details 
about the dark matter simulation, halo-finding algorithm, 
and the procedure for populating the haloes with galaxies. 
We begin with the Millennium Simulation (Springel et
al. 2005), which is a large N-body simulation of dark matter
structure in a cosmological volume. Dark matter particles
are traced in a cubic box of $500\,h^{-1}$ Mpc on a side, with a
halo mass resolution of $\sim 5 \times 10^{10}\,h^{-1}\,M_\odot$. Collapsed haloes
with at least 20 particles are identified with a friends-of-friends
group finder.

Figure 1. Slice of the redshift-space mock light cone, showing
galaxies within ±4 deg. Red/blue points are galaxies on
the red/blue sequences of the colour-magnitude diagram, and
larger/smaller points are brighter/fainter galaxies.

The haloes are populated with galaxies with luminosi- 
ties and colours, following the algorithm described in Skibba 
et al. (2006) and Skibba & Sheth (2009), which is con- 
strained by the luminosity and colour distribution and clus- 
tering in the Sloan Digital Sky Survey (SDSS; York et al. 
2000). An important assumption in the model is that all 
galaxy properties — their numbers, spatial distributions, ve- 
locities, luminosities, and colours — are determined by halo 
mass alone. We specify a minimum r-band luminosity for 
the galaxies in the catalogue, $M_r = -19$, to stay well above
the resolution limit of the Millennium Simulation, avoiding 
any issues of completeness that may bias our results. 

This procedure produces a mock galaxy catalogue con- 
taining 1.84 million galaxies, of which 29 percent are 'satel- 
lite' galaxies. Galaxies occupy haloes with masses ranging 
from $10^{11}$ to $10^{15.3}\,h^{-1}\,M_\odot$. We show a slice of the mock light
cone in Figure 1, in which more luminous galaxies are
identified with larger points, and red and blue sequence galaxies
with red and blue points. 

2.2 SDSS Galaxy Catalogue 

For comparison, we will also show clustering measurements 
in the main galaxy sample of SDSS Seventh Data Release 
(DR7; Abazajian et al. 2009), in a catalogue volume-limited 
to M r < -19, with 0.02 < z < 0.0642. 

Clustering measurements of galaxy redshift surveys
have traditionally been done by splitting a catalogue into
luminosity bins (e.g., Norberg et al. 2002; Zehavi et al. 2005,
2011; Li et al. 2006; Coil et al. 2008). We compare such
clustering in the SDSS and in the mock catalogue in
Figure 2. There is generally very good agreement, except at
$0.7 < r_p < 3$ Mpc/h for the brightest galaxies. Note that
the mock catalogue was constrained by earlier SDSS data
sets (Zehavi et al. 2005; Skibba et al. 2006), while these new
measurements use the full DR7 data set (Zehavi et al. 2011).

Most of the analysis of environmental correlations
throughout this paper is based on the mock galaxy
catalogue, with additional comparisons to the SDSS galaxy
catalogue in Section 4.4.

Figure 2. Upper panel: projected correlation function (circles)
and r-band luminosity-weighted correlation function (triangles)
of the full $M_r < -19$ mock catalogue. Lower panel:
correlation functions (CFs) for luminosity bins $-20 < M_r < -19$
(blue solid line), $-21 < M_r < -20$ (green long-dashed line),
$-22 < M_r < -21$ (red short-dashed line), and $-23 < M_r < -22$
(magenta dotted line). The crosses are the CFs measured from
SDSS DR7 (Zehavi et al. 2011); the measurement for
$-21 < M_r < -20$ is omitted for clarity.


3 SENSITIVITY TO ENVIRONMENTAL CORRELATIONS

In what follows, we will refer to any property of a galaxy,
e.g., its luminosity or its colour, as a 'mark'. As stated
above, the most commonly used mark statistic is the mark
correlation function, which is defined as the ratio of the
weighted to the unweighted correlation function:

$$M(r) \equiv \frac{1 + W(r)}{1 + \xi(r)} \approx \frac{WW(r)}{DD(r)}, \qquad (1)$$

where WW/DD is the pair count ratio. The mark projected
correlation function is similarly defined:
$M(r_p) = (1 + W_p/r_p)/(1 + w_p/r_p)$. If the weighted and unweighted
clustering are significantly different at a particular
separation $r$, then the mark is correlated (or anti-correlated) with
the environment at that scale; the degree to which they are
different quantifies the strength of the correlation.

1 The clustering measurements in this paper are performed using
the Ntropy code developed by Gardner, Connolly, & McBride
(2007).

In this section, we illustrate that mark correlation functions
are particularly sensitive to environmental effects. We
do so by introducing a small additional dependence of a
galaxy mark $w$ (luminosity or colour here) on the galaxy's
overdensity (using one of the environment measures defined
below). (We use the letter $w$ because, when we discuss mark
correlations below, we treat the mark $w$ as a 'weight'.) That
is, if a galaxy has mark $w$, then we change it to

$$w\,(1 + \delta)^{\alpha}. \qquad (2)$$

We then rank order the marks $w_\alpha$ and rescale them so that
they have the same distribution as before the environmental
effect was added. I.e., we require

$$p(> w_\alpha) = p(> w). \qquad (3)$$

Therefore, by construction, there is no trace of the additional
correlation with environment in the one-point statistics of
$w_\alpha$; it is only by studying how the one-point distribution
changes as a function of environment (the traditional
approach), or by measuring spatial correlations (such as mark
correlations), that one might discover this correlation. The
question is: which approach is more efficient, especially for
small $\alpha$, when $w_\alpha \approx w + \alpha w \delta$?
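The mark modification of equation (2) and the rank-preserving rescaling of equation (3) can be sketched as below. This is an illustrative reading of the procedure, not the authors' code: the function name `add_environment_dependence` is hypothetical, and ties between modified marks are simply broken by input order.

```python
def add_environment_dependence(marks, delta, alpha):
    """Apply w -> w (1 + delta)^alpha, then rescale so that p(> w_alpha) = p(> w).

    Illustrative sketch: the modified marks are rank ordered and each galaxy is
    assigned the value of the original mark distribution at the same rank.
    """
    modified = [w * (1.0 + d) ** alpha for w, d in zip(marks, delta)]
    # Indices of galaxies sorted by their modified mark (stable for ties).
    order = sorted(range(len(marks)), key=lambda i: modified[i])
    original_sorted = sorted(marks)
    rescaled = [0.0] * len(marks)
    for rank, i in enumerate(order):
        rescaled[i] = original_sorted[rank]
    return rescaled
```

By construction the returned list is a permutation of the input marks, so the one-point distribution is unchanged; only the assignment of marks to galaxies (and hence to environments) differs.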



3.1 Measures of environment 

There are many different methods of quantifying the envi- 
ronment, but most of them can be categorized into those 
that use a fixed aperture (FA) and others that use near- 
neighbor (NN) finding. A variety of environment measures 
are analyzed in M12, and we use a subset of these, which 
are briefly described below. Unless stated otherwise, all of 
the FA and NN overdensities used in this paper are based 
on redshift-space distances, as they would be in real data. In 
addition, in all cases the density-defining population (DDP) 
consists of galaxies brighter than the luminosity threshold,
$M_r < -19$.

FA measures are often expressed as a local density con- 
trast, determined by counting the number of galaxies within 
a given radius, and taking the ratio with the mean density. 
The density contrast is typically defined as

$$\delta \equiv \frac{N_g - \bar{N}_g}{\bar{N}_g}, \qquad (4)$$

where $N_g$ is the number of galaxies found in the aperture,
and $\bar{N}_g$ is the mean number of galaxies that would be
expected in the aperture if the galaxies were randomly
distributed. The motivation for using apertures of a particular
size is often so that they enclose all of the galaxies
within a dark matter halo, while accounting for the effect
of redshift-space distortions and redshift uncertainties (Abbas
& Sheth 2005; Gallazzi et al. 2009). We use $1 + \delta_8$, the
overdensity in spherical apertures of radius 8 Mpc/h (Croton
et al. 2005; Abbas & Sheth 2005). Cylindrical apertures
and annuli (Gallazzi et al. 2009; Wilman et al. 2010) yield
qualitatively similar results.

Figure 3. $1 + \delta$ overdensity distributions for fixed-aperture (FA)
environment measures [8 Mpc/h spheres (black solid histogram)
and cylinders (red long-dashed histogram)], and for nearest-neighbor
(NN) environment measures [$\sigma_{4,5}$ (blue short-dashed
histogram) and $\Sigma_3$ (magenta dotted histogram)], which have
longer tails at large $1 + \delta$.
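A fixed-aperture density contrast in the sense of equation (4) can be sketched as follows. The sketch assumes a non-periodic box, excludes the central galaxy from its own count, and estimates the expected count from the mean number density of the whole box; these choices, and the function name `fixed_aperture_delta`, are assumptions for illustration rather than the M12 implementation.

```python
import math


def fixed_aperture_delta(positions, radius, box_volume):
    """Density contrast (N_g - Nbar_g) / Nbar_g in spheres around each galaxy."""
    mean_density = len(positions) / box_volume
    # Expected count in the aperture for a random (unclustered) distribution.
    expected = mean_density * (4.0 / 3.0) * math.pi * radius ** 3
    deltas = []
    for i, p in enumerate(positions):
        count = sum(
            1
            for j, q in enumerate(positions)
            if j != i and math.dist(p, q) <= radius
        )
        deltas.append((count - expected) / expected)
    return deltas
```

A galaxy with no neighbours inside the aperture has a contrast of exactly -1, the minimum possible value.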

NN measures exploit the fact that objects with nearer 
neighbors tend to be found in denser environments. A value
of $N$ is chosen that specifies the number of neighbors around
the point of interest. One can define a projected surface
density or a spherical density:

$$\sigma_N = \frac{N}{\pi r_N^2}, \qquad \Sigma_N = \frac{N}{(4/3)\pi r_N^3}, \qquad (5)$$

where $r_N$ is the radius to the $N$-th nearest neighbor. We
use the Baldry et al. (2006) measure, which is an average
of $\log\sigma_N$ for $N = 4$ and 5 with a redshift limit
($\pm\Delta z c = 1000$ km/s) on the DDP, and the $\Sigma_N$ measure for $N = 3$
described in M12. To use these in the same way as the
fixed-aperture overdensities, the NN densities need to be
normalized, and we do this by defining
$\delta = (\sigma - \bar{\sigma})/\bar{\sigma}$, where $\sigma$ is
one of the two density measures described above [i.e., $\Sigma_3$ or
$(\log\sigma_4 + \log\sigma_5)/2$].
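The two nearest-neighbor densities of equation (5), and their normalization to an overdensity, can be sketched as below. The sketch assumes the line of sight is the z axis (so the projected radius uses only x and y) and omits the redshift limit on the density-defining population; the function names are hypothetical.

```python
import math


def nn_densities(positions, index, n_neighbors):
    """Return (sigma_N, Sigma_N) for the galaxy at `index` (illustrative sketch)."""
    p = positions[index]
    others = [q for j, q in enumerate(positions) if j != index]
    # Radius to the N-th nearest neighbor, in projection and in three dimensions.
    r_proj = sorted(math.dist(p[:2], q[:2]) for q in others)[n_neighbors - 1]
    r_3d = sorted(math.dist(p, q) for q in others)[n_neighbors - 1]
    sigma_n = n_neighbors / (math.pi * r_proj ** 2)
    big_sigma_n = n_neighbors / ((4.0 / 3.0) * math.pi * r_3d ** 3)
    return sigma_n, big_sigma_n


def normalise(values):
    """Convert densities to delta = (value - mean) / mean."""
    mean = sum(values) / len(values)
    return [(v - mean) / mean for v in values]
```

Because the radius shrinks as the local density rises, these estimates adapt their scale to the environment, which is why they probe smaller scales than a fixed aperture in dense regions.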

Figure 3 shows the distribution of these $1 + \delta$ overdensities
in the mock catalogue. Note that the FA overdensity
distributions have a similar shape (see also de la Torre et al.
2010), as do those of the NN overdensities. For simplicity,
throughout the rest of this paper we will usually focus on
a single FA environment measure, 8 Mpc/h spheres, and a
single NN measure, the combination of $\sigma_4$ and $\sigma_5$, which we
will henceforth refer to as $\sigma_{4,5}$.

Clearly, the NN measures have longer tails at large
$1 + \delta$, consistent with the expectation that they probe smaller
scale environments. (That is, if the NN weights trace the
environment within haloes, then we expect them to have
$1 + \delta \sim 200$; and if the FA weights trace the environment
around each halo, they should have $1 + \delta \sim 1 + \sigma_8$, where $\sigma_8$ is
the rms of the linear density fluctuation field within
8 Mpc/h spheres.) As we show below, mark correlations
allow us to quantify this expectation. But before doing so, we
note that standardizing the distributions by subtracting the
mean and dividing by the rms still yields distributions with
different shapes.



3.2 The traditional approach 

In what follows, we will illustrate our results using these FA
and NN overdensities. Specifically, in this section we will
insert $\delta$ in equations (2) and (3) to define $w_\alpha$ for each galaxy,
thus adding an environmental dependence to $w$ (luminosity
or colour), and then we will measure the distribution of
(rescaled) $w_\alpha$ in a number of different overdensity bins.

Figure 4 shows the results for the luminosity marks.
The various histograms in each panel show the
overdensity-dependent luminosity distributions, $p(L_\alpha|\delta)$, for various
choices of $\alpha$ (0, 0.01, and 0.05). The different panels show
different bins in $\delta$ (lowest and highest 10%; $p(L_\alpha|\delta)$ of
intermediate bins have smaller differences), and there are clearly
more luminous galaxies and fewer faint galaxies in dense
environments. The question is whether the differences between
the $\alpha = 0$ counts and the others are statistically significant.
Obviously, one must be far from $\delta = 0$ to see a difference;
how far is far enough depends on $\alpha$. Kolmogorov-Smirnov
(KS) tests suggest that large values of $\alpha$ and/or $\delta$ far from
0 are required. Since the large-$|\delta|$ tails typically contain a
small fraction of the full sample (Figure 3), this technique, in
effect, cannot use the vast majority of the sample to detect
the fact that environment matters.
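A two-sample KS comparison of the kind referred to above can be sketched in a few lines. The sketch below uses the standard asymptotic series for the KS probability, which is only approximate for small samples; the function name `ks_two_sample` and the cut-off below which the probability is set to unity are assumptions of this example, not details taken from the analysis in this paper.

```python
import math
from bisect import bisect_right


def ks_two_sample(a, b):
    """Return (D, P_KS): maximum ECDF difference and asymptotic probability."""
    a, b = sorted(a), sorted(b)
    na, nb = len(a), len(b)
    d = 0.0
    # The empirical CDFs only change at sample values, so check each of them.
    for x in a + b:
        fa = bisect_right(a, x) / na
        fb = bisect_right(b, x) / nb
        d = max(d, abs(fa - fb))
    n_eff = na * nb / (na + nb)  # effective sample size
    lam = math.sqrt(n_eff) * d
    if lam < 0.2:
        # The series converges slowly here and the probability is ~1.
        return d, 1.0
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam) for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)
```

Since the probability depends on the product of the effective sample size and the squared difference, the same difference between two distributions is far less significant in a catalogue that is many times smaller.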

As described earlier in Section 3, to allow for a fair
comparison, we have rank ordered and rescaled the luminosities
so that they have the same distributions [$p(> L_\alpha) = p(> L)$,
where $L_\alpha$ are the luminosities with $\alpha > 0$]. Therefore, the
overall luminosity distributions are the same by construction,
but at fixed overdensity the rescaled ones (with $\alpha > 0$)
may be shifted from the original one ($\alpha = 0$). In each panel
of the figure, we find that the differences between the
luminosity distributions $p(L|\delta)$ and $p(L_\alpha|\delta)$ appear to be very
small, except for $\alpha = 0$ versus $\alpha = 0.05$ with the NN
overdensity.

Figure 5 shows a similar analysis of (rescaled) $(g - r)_\alpha$
colours. Recall that the distributions, when averaged over
all $\delta$, have been rescaled to be the same. It is evident that
there are fewer red galaxies in underdense regions than in
very dense ones. In addition, in any overdensity bin, the
distributions $p(g - r|\delta)$ and $p[(g - r)_\alpha|\delta]$ are similar, but
with significant differences in the red sequence at very low
and very high overdensities.

In general, the weak dependence on $1 + \delta$ appears to
produce subtle differences in the luminosity and colour
distributions. We quantify the statistical significance of this with
Kolmogorov-Smirnov (KS) tests. We find that these in fact
yield low KS probabilities, indicating that the distributions
$p(w|\delta)$, where $w$ is luminosity or colour, do have statistically
significant differences, even for low values of $\alpha$. However, the
significance depends on the number statistics, and typical
SDSS galaxy catalogues are at least 25 times smaller than
the mock catalogue used here. When we account for this, at
fixed density, we obtain $P_{\rm KS} = 0.14$ and 0.51 for $p(L_{\alpha=0.01}|\delta)$
in the lowest and highest density bins for the FA overdensities,
respectively, making these distributions statistically
indistinguishable, while the corresponding colour distributions
are marginally distinguishable. For larger values of $\alpha$, and
for the NN overdensities, lower probabilities are obtained
($P_{\rm KS} < 10^{-3}$), indicating statistically significant differences
between the luminosity and colour distributions, especially
near the peak of the red sequence. Note that for these
probability distribution functions, we have normalized by the
mean luminosity or colour of the full catalogue; if we normalize
by the mean of a given density bin, then the $p(w|\delta)$ become
more similar ($P_{\rm KS}$ close to unity are obtained, indicating
virtually identical distributions), except for $\alpha = 0.05$ in the
highest density bin.

Figure 4. Rescaled r-band luminosity functions for bins of
environmental overdensity (upper panels: ranked by 8 Mpc/h sphere
overdensities; lower panels: ranked by 4th and 5th NN
overdensities). Only the distributions for the lowest-density (left) and
highest-density (right) 10 percent are shown. Black solid, red
dashed, and blue dotted histograms indicate the distributions for
$\alpha = 0$, 0.01, and 0.05, respectively.

Figure 5. Rescaled $g - r$ colour distributions for bins of
environmental overdensity (upper panels: 8 Mpc/h spheres; lower
panels: 4th and 5th nearest neighbors). Only the distributions for
the lowest-density (left) and highest-density (right) 10 percent
are shown. Black solid, red dashed, and blue dotted histograms
indicate the distributions for $\alpha = 0$, 0.01, and 0.05, respectively.

Figure 6. The colour-density relation, using 8 Mpc/h sphere
overdensities (upper panel) and $\sigma_{4,5}$ nearest-neighbor
overdensities (lower panel). Black, red, and blue lines show the running
mean of the relation for $\alpha = 0$, 0.01, and 0.05, respectively. Dotted
lines indicate the 1-$\sigma$ range between the 16 and 84 percentiles.

One can also consider the 'colour-density relation' or 
density-dependent red fraction (e.g., Hogg et al. 2003; 
Balogh et al. 2004; Cooper et al. 2006; Weinmann et al. 2006; 
Park et al. 2007). We show the $g - r$ colour-density relation
of the mock catalogue in Figure 6, using the 8 Mpc/h sphere
overdensities. Note the dip in the relation between the blue
cloud and red sequence. The colour-density relations for the
colours modified by $(1 + \delta)^\alpha$ and rescaled following
equation (3) are also shown. With $\alpha > 0$, the colours have been
given an additional environmental dependence, and thus we
expect them to have a stronger (i.e., steeper) correlation
with overdensity compared to $\alpha = 0$. Evidently, the
colour-density relation is only slightly steepened for $\alpha > 0$, and
the steepening would be difficult to detect, depending on the environment
measure used (see also M12). In addition, the relations only
cross for the red-sequence colours, so the densities in
overdense regions must be accurate in order to detect different
environmental trends.

The analogous 'luminosity-density' relation (not shown) 
has a similar shape as the luminosity-halo mass relation 
(e.g., More et al. 2009), except that at luminosities fainter 
than L* (the break in the luminosity function), the relation
has increased scatter and is no longer monotonic. This
behavior is not due to faint satellite galaxies in group/cluster
environments, which are outnumbered by faint 'field'
galaxies; it is instead due to the fact that neither the FA nor the NN
environment measures accurately probe the environments
of low-mass haloes (see M12 for details).

© 0000 RAS, MNRAS 000, 000-000
R. Skibba et al.

3.3 Mark correlations 

We now turn to the mark correlation measurements, as an
alternative to the traditional approach to quantifying
environmental correlations. The result is shown in Figure 7 for
$\alpha = 0.01$ and 0.05, using the FA overdensities.

Note that even a dependence as weak as $(1 + \delta)^{0.01}$
results in a significantly stronger signal, while the effect of
$\alpha = 0.05$ is substantially larger still. To clearly demonstrate
this, the lower panels of the figure show the ratio of these
mark correlation functions (triangle and square points) to
that of the unmodified ($\alpha = 0$) mark correlation functions
(solid curves). The dashed lines show the jack-knife errors$^2$
of the clustering measurements, indicating that the
environmental correlations with $\alpha = 0.01$ are detectable except
for the smallest separations. We have also estimated
jack-knife errors of similar measurements of the SDSS catalogue
described in Section 2.2, and have found that these are
systematically larger than the errors of the mock catalogue's
mark correlations. Nonetheless, we find that a dependence
on $(1 + \delta)^{0.01}$ is still detectable at least for colour marks.

Note that the quantitative effect of $\alpha$ on the mark
correlation functions is similar for the two marks. In Appendix A we argue that
the $(1 + \delta)^\alpha$ weighting adds a new mark signal which, for
$\alpha \ll 1$, is proportional to $2\alpha$. Therefore, given the mark
correlation with a particular value of $\alpha$ (e.g.,
$\alpha = 0.01$), we can predict the signal for a different $\alpha$ (e.g.,
$\alpha = 0.05$). This expectation is borne out: the predicted mark
correlation functions with $\alpha = 0.05$ are nearly identical to
the measured ones.
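One simple way to express this prediction is sketched below. It assumes that the added signal enters additively on top of the unmodified ($\alpha = 0$) mark correlation function and scales linearly with $\alpha$, which should only hold for $\alpha \ll 1$; the function name `predict_mark_correlation` is hypothetical, and the exact form used in the appendix may differ.

```python
def predict_mark_correlation(m_zero, m_ref, alpha_ref, alpha_new):
    """Scale the excess signal M_alpha(r) - M_0(r) linearly in alpha.

    m_zero: mark correlation function measured with alpha = 0, per bin.
    m_ref:  mark correlation function measured with alpha = alpha_ref.
    Assumes the added signal is proportional to alpha (valid for alpha << 1).
    """
    scale = alpha_new / alpha_ref
    return [m0 + scale * (mr - m0) for m0, mr in zip(m_zero, m_ref)]
```

For instance, an excess of 0.02 above the unmodified signal at $\alpha = 0.01$ would correspond to a predicted excess of 0.10 at $\alpha = 0.05$ under this assumption.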

We show analogous mark correlation functions using
NN overdensities in Figure 8. These overdensities are clearly
more sensitive to small-scale environmental correlations
than large-scale ones. In addition, the new mark signal for
$\alpha = 0.01$ is larger than for the FA overdensities (triangle
points in Fig. 7), which is due to the long tail of the NN
overdensities at large $1 + \delta$. In Section 4 we account for this
effect and find that these overdensities are particularly
sensitive to environmental correlations on scales smaller than
600 kpc/h. An advantage of the mark correlation approach
is that it quantifies the scale dependence of environmental
trends, and that it exploits the entire dataset (rather than
splitting a sample with overdensity cuts, for example).



4 RANK-ORDERED MARK CORRELATIONS 

4.1 The traditional WW/DD measurement 

We now demonstrate how mark correlation functions
quantify the scale dependence of environmental correlations by
using the FA and NN estimates of the local density around
galaxies, $1 + \delta$, as the weight (or mark). As these quantities
are intended to be direct probes of galaxy environments,
rather than indirect ones such as luminosity or colour, one
would expect stronger mark correlation signals than those
obtained in the previous section.

2 Statistical errors are estimated with "jack-knife" resampling,
using 27 subsamples of the full mock cube. The variance of the
clustering measurements yields the error estimates. (For details,
see Zehavi et al. 2005; Norberg et al. 2009.)
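The jack-knife error estimate described in footnote 2 can be sketched as follows. The sketch assumes the 27 subsamples are the cells of a 3 x 3 x 3 division of the cube and that the statistic has been remeasured with each cell removed in turn; the variance prefactor $(n-1)/n$ is the standard delete-one jack-knife form, and the function names are hypothetical.

```python
import math


def subcube_index(position, box_size, n_side=3):
    """Index (0 .. n_side**3 - 1) of the sub-cube containing a position."""
    cells = [min(int(x / box_size * n_side), n_side - 1) for x in position]
    return (cells[0] * n_side + cells[1]) * n_side + cells[2]


def jackknife_errors(leave_one_out):
    """Per-bin errors from measurements made with one subsample removed.

    leave_one_out[k][b] is the statistic in bin b with subsample k omitted.
    """
    n = len(leave_one_out)
    errors = []
    for b in range(len(leave_one_out[0])):
        values = [row[b] for row in leave_one_out]
        mean = sum(values) / n
        variance = (n - 1) / n * sum((v - mean) ** 2 for v in values)
        errors.append(math.sqrt(variance))
    return errors
```

In use, each galaxy would be assigned a sub-cube index, and the mark correlation function would be remeasured 27 times, each time omitting the galaxies in one sub-cube.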

The filled symbols in Figure 9 show the result. The NN
weights, using $\sigma_{4,5}$ overdensities, produce a much stronger
signal, especially on small scales. This is consistent with
previous work, in which a NN local density was used as a weight
(White & Padmanabhan 2009). In Appendix B we also
illustrate the effect of using the small-scale environment as a
weight.

The obvious jump in amplitude at $r < 2\,h^{-1}$ Mpc for the
NN weight is consistent with the expectation that it probes
scales within haloes. In addition, the fact that the FA signal
reaches a maximum at $1\,h^{-1}$ Mpc, which is roughly the scale
of a group or cluster, suggests that these are the pairs which
are in the densest larger-scale environments. The decrease
on smaller scales indicates that an increasing fraction of the
closest pairs, which may be low-mass interacting galaxies,
are not in particularly dramatic larger-scale ($\sim 8\,h^{-1}$ Mpc)
overdensities.

On the other hand, comparing the FA signal to the NN 
signal is less straightforward. For example, it is not clear 
what to make of the fact that the two weights have the 
same amplitude at scale $r > 10\,h^{-1}$ Mpc (other than that
pairs separated by $10\,h^{-1}$ Mpc have weights which are above
average by the same factor). This is because the two sets
of weights have rather different distributions (Figure 3). We
explore how to remove this in the next section. 



4.2 Rank ordered marks 

As stated previously, the strength of mark correlations is af- 
fected by the shape of the marks' distributions, which makes 
it difficult to fairly compare the mark correlations of differ- 
ent marks. To remove the effect of the distributions, we first 
rank order the marks. We could then have scaled one of 
the distributions to the other, but this would not allow us 
to compare either of these with a third mark, for example. 
Instead, as a more general solution, we perform the rank 
ordering and then use the rank itself as the mark. The open 
symbols in Figure 9 show the result of doing this and then
remeasuring WW/DD. (In practice, we rank order and then 
match to a uniform distribution on [1,N]. In this way, all 
marks are scaled to the same distribution, so the mark cor- 
relation signal can be compared between marks. However, 
the matching to a uniform distribution is not really neces- 
sary.) 
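Rank ordering a set of marks and using the rank as the weight can be sketched as below. The function name `rank_marks` is hypothetical, ties are broken by input order, and the ranks run over [1, N] as in the text.

```python
def rank_marks(marks):
    """Replace each mark by its rank (1 = smallest, N = largest)."""
    order = sorted(range(len(marks)), key=lambda i: marks[i])
    ranks = [0] * len(marks)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks
```

Because any monotonically increasing transformation of a mark (for example, taking the logarithm of the luminosity) leaves the ranks unchanged, the rank-ordered signal does not depend on the 'units' in which the mark is expressed.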

The rescaling changes the ratio of the small- to large- 
scale signal dramatically, particularly for the NN weights, 
for which the required rescaling is much larger. Evidently, 
the large-$\delta$ tail in Figure 3 contributes significantly to the
small r signal; while not unexpected, it is nice to see this 
confirmation that the NN weights really do correspond to 
small scale environments. (We will return to the flatness of 
the signal shortly.) 

On the other hand, notice that now the feature at
$1\,h^{-1}$ Mpc in the FA signal has gone away. This shows that
rescaling comes with a cost, since there may be information
in the shape of the distribution, which rank-ordering
removes. To illustrate the effect of rank ordering, a model
based on cluster and field populations is described in the
Appendix.

Figure 7. r-band luminosity (left) and $g - r$ colour (right) mark correlation functions (upper panels), with additional environmental
correlations using FA overdensities: $(1 + \delta)^\alpha$ with $\alpha = 0.01$ (red triangles) and 0.05 (blue squares). The lower panels show the ratio
$M(\alpha > 0)/M(\alpha = 0)$, to more clearly indicate the effect of the additional environmental correlations. The $\alpha$-dependence of the effect on
the mark signal can be easily estimated (open squares; see Appendix A). The dashed lines show the uncertainty of the measurements.

Figure 8. r-band luminosity (left) and $g - r$ colour (right) mark correlation functions (upper panels), like Figure 7 but with additional
environmental correlations using NN overdensities: third nearest neighbor (open triangles) and fourth and fifth nearest neighbors (solid
triangles). The results are shown for $\alpha = 0.01$, and the effect is relatively strong because of the long tail in the NN overdensity distributions
(Fig. 3).

Figure 9. Mark correlation functions where FA and NN estimates
of the local density (8 Mpc/h spheres and $\sigma_{4,5}$,
respectively) were used as marks. Filled symbols show the original
measurement, and open symbols show the result of rank ordering
and rescaling to [0,1] before making the measurement.



4.3 Effect of redshift-space distortions 

Throughout this paper we have used environment measures 
based on redshift-space distances. It is important to con- 
sider the effect of redshift-space distortions on these envi- 
ronments, in the context of rank-ordered mark correlations. 
The nonlinear virial motions of galaxies in haloes spread 
out objects in groups and clusters along the line-of-sight to 
produce 'fingers-of-god' (FOG; Jackson 1972; Peebles 1980). 
These small-scale distortions can affect how galaxy environ- 
ments are assessed, as we see using the Park et al. (2007) 
NN environment measure$^3$ in Figure 10. In the left panel,
the real-space and z-space overdensities clearly have consid- 
erable scatter between them, and in z-space there is a deficit 
of very large overdensities, due to the FOG spreading them 
out. 

This is also seen in the mark correlation functions (right 
panel, analogous to the rescaled mark correlations in Fig. 9).
The FOG distortions can result in underestimated densities
for small-separation pairs (r_p < 1 Mpc/h), but overestimated
densities for more widely separated pairs (r_p > 1 Mpc/h)
where the FOG reach into underdense regions (see also Ab- 
bas & Sheth 2007). We have tested this using smaller-scale 



3 Park et al. (2007) local densities are estimated by using 20
nearest neighbor galaxies, with the galaxies centrally weighted by
a spherical adaptive smoothing kernel (and hence not equivalent
to Σ_20 as defined in the main text).



overdensities (Σ_3, used in Fig. 8), which have a similar
result, but the transition between underestimated and
overestimated overdensities occurs at smaller separations. Note that
the small-scale downturn at r_p < 400 kpc/h in the mark
correlations (in the right panel of Fig. 10 and in Fig. 9 with the
FA overdensity marks) is not due to FOG, since it occurs 
with the real-space overdensities as well; it occurs because 
these environment measures best probe larger-scale environ- 
ments, as opposed to environment measures with smaller 
apertures or fewer neighbours. 

Note too that even though the δ_r − δ_z mark correlations
depart from unity at both small and large scales, rescaling
to a uniform distribution nonetheless results in 1 + δ mark
correlations that are approximately consistent with the
real-space ones, especially at larger separations, demonstrating
the utility of the rank-ordered mark correlations. 



4.4 Colour and luminosity 

We remarked in the introduction that it is difficult to com- 
pare the usual measurements of the colour and luminosity 
mark correlations with one another. These measurements 
are shown in Figure 11. Note that, in contrast to the
previous section, here we also compare measurements in the
mocks with similar measurements in the SDSS. The agree- 
ment between the triangles and crosses shows that the mock 
catalogues faithfully reproduce the luminosity and colour 
dependence of clustering; since these WW/DD signals were 
not used to construct the mocks, they represent nontrivial 
tests of the mock-making algorithm. 

While this is reassuring, a puzzle lies in the fact that 
we naively expect colour to correlate more strongly with en- 
vironment than luminosity (e.g., Butcher & Oemler 1984; 
Bower et al. 1998; Diaferio et al. 1999; Blanton et al. 2005), 
but the WW/DD signals do not show this: the amplitude 
of the luminosity mark correlations is stronger than that of 
colour. This is primarily because the two weights have very 
different distributions (Figs. 2 and 5); that of luminosity is
much broader, such that bright galaxies have L ≫ L̄.
Figure 12 shows the result of rank ordering and using the rank
as a mark instead. Unlike the previous figure, now colour 
clearly produces the stronger signal. It is also worth noting 
that the error bars are much larger for the L-weighted sig- 
nal, showing that a large range of L-ranks contributes at 
each r; this range is much narrower for g − r colour.
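
The rank-ordering step can be illustrated with a minimal sketch. The function name `rank_rescale`, the toy mark values, and the choices of rescaling ranks to (0, 1] and breaking ties by input order are assumptions made for illustration, not details taken from the analysis itself.

```python
def rank_rescale(marks):
    """Replace each mark by its rank, rescaled to the interval (0, 1].

    Ties are broken by input order (an assumption of this sketch).
    """
    n = len(marks)
    # Indices of the objects, ordered from the smallest to the largest mark.
    order = sorted(range(n), key=lambda i: marks[i])
    ranks = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank / n
    return ranks


# Hypothetical marks: a luminosity-like mark with a long bright tail and a
# colour-like mark with a narrow range.
luminosity = [0.3, 0.5, 0.4, 25.0, 0.6]
colour = [0.45, 0.80, 0.62, 0.95, 0.71]

print(rank_rescale(luminosity))  # [0.2, 0.6, 0.4, 1.0, 0.8]
print(rank_rescale(colour))      # [0.2, 0.8, 0.4, 1.0, 0.6]
```

After this transformation both marks share the same uniform distribution, so any difference between their mark correlation signals reflects how the marks are ordered with respect to environment rather than how broad their distributions are.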

Finally, our rank ordering procedure allows us to
compare these measurements with those in Section 4.2.
Comparing the luminosity and colour mark correlation
functions (Fig. 11) to the local-density mark correlations (Fig. 9)
shows that both luminosity and colour produce significantly 
weaker signals. 



5 CONCLUSIONS AND DISCUSSION 

Our key results can be summarised as follows: 

• Mark correlation functions are particularly sensitive 
to environmental correlations, specifically when using
enhanced weights of (1 + δ)^α with α > 0.01, though this
sensitivity depends on the environment measure, scale, and the
mark's uncertainty. 

Figure 10. Left: contour plot of real-space vs z-space Park et al. (2007) NN overdensities. Right: rank-ordered and rescaled mark
correlation functions of these real-space and z-space overdensities (triangle and square points, respectively), analogous to the mark
correlations in Fig. 9. The δ_r − δ_z mark correlations (circles) are also shown.

Figure 11. r-band luminosity and g − r colour mark correlation
functions for galaxies with M_r < −19 in the SDSS DR7 (crosses)
and in the mock catalogue (triangles). Error bars show jack-knife
errors on the measurements, and dotted lines show the scatter
from randomly scrambling the marks (see text for details).



• Small environmental correlations are difficult to detect 
with more traditional methods, as highlighted by the (lack 
of) variation of most of the mark distributions at fixed over- 
density, and in the colour-density relation. 

• Rank ordering the marks and then using the rank as the
weight provides a simple way to compare results for different
marks, because it removes any dependence on the marks'
distributions.

Figure 12. Same as previous figure, but now the weights have
all been rank-ordered and then scaled to a uniform distribution.
Although the qualitative trends are the same as in the previous
figure, now the signal is stronger in the bottom panel (i.e., when
weighting by colour), indicating that the colour-density
correlation is stronger than the luminosity-density one.

The analysis in this paper highlights the advantages
(and disadvantages) of mark clustering statistics. The fact 
that they are sensitive to weak environmental correlations 
that traditional methods have difficulty detecting demon- 
strates their utility. Mark statistics are particularly useful 
for identifying and quantifying environmental trends. This 
owes to the fact that the statistics of entire samples can be 
folded together, producing clearer correlations than simply 
binning environments or splitting galaxies into 'field' and 
'cluster' subsamples, for example. Nonetheless, one cannot 
determine from these trends alone which galaxies occupy 
which environments; more information is needed (e.g., from 
halo models of galaxy clustering) in order to associate par- 
ticular galaxies with environments of particular halo mass 
or overdensity (Skibba et al. 2006; Skibba & Sheth 2009). 

In contrast, methods that characterize individual lo- 
cal galaxy environments can associate galaxies with over- 
densities, though they too have strengths and weaknesses. 
Fixed-aperture and nearest neighbor overdensities are sensi- 
tive to inter- and intra-halo environments, respectively, con- 
sistent with the findings of M12 and Haas et al. (2012).
We showed this with the scale-dependent mark clustering
measurements, which overlapped at r_p ~ 600 kpc/h, within
the 'one-halo term'; fixed apertures, if sufficiently large, can 
encompass entire haloes as well as some of the surround- 
ing regions. Nonetheless, the interpretation of environmen- 
tal trends can be difficult, and depends crucially on how 
the overdensities are measured and on the density-defining 
population. 

One can also interpret our results in terms of central 
and satellite galaxies in haloes. Since satellite luminosities, 
colours, and stellar masses depend only weakly on halo mass, 
and hence only weakly on the environment (e.g., Skibba et 
al. 2007; van den Bosch et al. 2008; Skibba 2009; see also 
Neistein et al. 2011; De Lucia et al. 2012), the majority 
of the environmental correlations that we detect are due 
to the central galaxies. For example, the colours of central 
galaxies are strongly halo mass dependent, and this is clearly 
shown by the colour mark correlations, which are especially 
sensitive to the dependence on overdensity. 

Finally, we note that rank-ordered mark correlation 
functions are applicable to any comparative analysis of en- 
vironmental trends involving large catalogues of objects in 
surveys or simulations with sufficiently accurate distances 
and marks, and are useful for testing or constraining models. 
Rank-ordered mark correlations could be useful for quanti- 
fying and comparing measures of 'halo assembly bias' (e.g., 
Sheth & Tormen 2004; Wechsler et al. 2006; Harker et al. 
2006; Croton, Gao & White 2007; Croft et al. 2012), such 
that halo formation time, concentration, or occupation is 
weakly correlated with the environment at fixed mass. These 
statistics could also be applied to tests of 'halo abundance 
matching' (HAM; Vale & Ostriker 2006; Conroy, Wechsler 
& Kravtsov 2006; Neistein et al. 2011; Trujillo-Gomez et al. 
2011; Kang et al. 2012) methods, in which central/satellite 
galaxies and dark matter haloes/subhaloes are rank ordered 
by their luminosities, masses, or circular velocities, and their 
cumulative number densities are matched. 
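
As an illustration of the rank matching that abundance matching relies on, the following is a minimal sketch. It assumes equal-sized halo and galaxy samples drawn from the same volume (so that equal rank corresponds to equal cumulative number density) and no scatter between the two properties; the function name `abundance_match` and the input values are hypothetical.

```python
def abundance_match(halo_masses, galaxy_luminosities):
    """Assign luminosities to haloes by rank: the most massive halo receives
    the most luminous galaxy, the second most massive the second, and so on.

    Assumes equal-sized samples from the same volume and no scatter.
    """
    if len(halo_masses) != len(galaxy_luminosities):
        raise ValueError("this sketch assumes equal-sized samples")
    # Halo indices ordered from the most to the least massive.
    halo_order = sorted(range(len(halo_masses)),
                        key=lambda i: halo_masses[i], reverse=True)
    # Luminosities ordered from the brightest to the faintest.
    lum_sorted = sorted(galaxy_luminosities, reverse=True)
    assigned = [None] * len(halo_masses)
    for rank, halo_idx in enumerate(halo_order):
        assigned[halo_idx] = lum_sorted[rank]
    return assigned


# Hypothetical halo masses and luminosities in arbitrary units.
print(abundance_match([1e12, 5e13, 3e11], [2.0, 0.5, 1.0]))  # [1.0, 2.0, 0.5]
```

Because only the ordering of each property enters, the assigned marks are natural inputs for a rank-ordered mark correlation test.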

APPENDIX A: EFFECT OF THE ADDITIONAL
ENVIRONMENTAL CORRELATION 

We provide an estimate of the effect of the added
environmental correlation (Eqn. 2) on the marked correlation
function.

As described in the main text, the statistic M(r) can be
approximated by the simple pair count ratio WW/DD, where
WW is the sum over all pairs with separation r, weighting 
each member of the pair by its mark, and DD is the total 
number of such pairs. To use a specific example in the paper
(see Sec. 3), suppose that rather than weighting galaxies by
their luminosity L, we modify the weight by adding a small
dependence on the 8 Mpc/h overdensity, which we will
denote 1 + δ_8. In this case, the modified weight can be expressed
as



$$ w = L (1 + \delta_8)^{\alpha} \approx L (1 + \alpha \delta_8), \tag{A1} $$



where α is small. For the mark correlation function, we will
normalize by the mean mark, and the mean of the above 
expression is simply 



$$ \langle w \rangle \approx \langle L (1 + \alpha \delta_8) \rangle \approx \langle L \rangle . \tag{A2} $$



Therefore, for a pair of galaxies i and j at separation r,
we have

$$ WW(r) = \frac{\langle L_i (1 + \delta_i)(1 + \alpha \delta_{8,i}) \, L_j (1 + \delta_j)(1 + \alpha \delta_{8,j}) \rangle}{\langle w \rangle^2} . \tag{A3} $$

If L is not significantly correlated with density (which is not
quite true, because L is correlated with M_halo and hence δ),
then this becomes

$$ WW(r) = \langle [(1 + \delta)(1 + \alpha \delta_8)]_i \, [(1 + \delta)(1 + \alpha \delta_8)]_j \rangle . \tag{A4} $$



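Keeping the lowest order in α, this can be expanded as
follows:

$$ \begin{aligned}
WW(r) &\approx \langle (1 + \delta_i)(1 + \delta_j)(1 + \alpha \delta_{8,i} + \alpha \delta_{8,j}) \rangle \\
&\approx \langle 1 + \delta_i + \delta_j + \delta_i \delta_j + \alpha \delta_{8,i} (1 + \delta_i + \delta_j + \delta_i \delta_j) + \alpha \delta_{8,j} (1 + \delta_i + \delta_j + \delta_i \delta_j) \rangle \\
&\approx 1 + \langle \delta_i \delta_j \rangle + 2 \alpha \left( \langle \delta_{8,i} \delta_i \rangle + \langle \delta_{8,i} \delta_j \rangle + \langle \delta_{8,i} \delta_i \delta_j \rangle \right) .
\end{aligned} \tag{A5} $$

Since DD = 1 + ⟨δ_i δ_j⟩, we have

$$ \frac{WW}{DD} \approx 1 + 2 \alpha \, \frac{\langle \delta_{8,i} \delta_i \rangle + \langle \delta_{8,i} \delta_j \rangle + \langle \delta_{8,i} \delta_i \delta_j \rangle}{DD} , $$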

which shows that the new environmental correlation that we
introduced in the text produces a new signal proportional
to 2α. The approximations here appear to be accurate, as
the mark CFs with α = 0.05 can be predicted from those
with α = 0.01, and vice versa, at all separations r (see e.g.
Fig. 7).
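
The near-linear scaling with α can be checked on a toy catalogue. The sketch below is not the measurement code used for the figures: it uses a brute-force WW/DD estimator in one dimension, and the catalogue (four tight clumps with an assumed 1 + δ_8 = 4, twenty isolated objects with an assumed 1 + δ_8 = 0.5, and arbitrary luminosities) is entirely hypothetical.

```python
from itertools import combinations


def mark_correlation(positions, weights, r_min, r_max):
    """Brute-force M(r) = WW/DD for pairs with r_min <= |x_i - x_j| < r_max.

    Weights are normalised by their mean; positions are 1D for simplicity.
    """
    mean_w = sum(weights) / len(weights)
    ww, dd = 0.0, 0
    for i, j in combinations(range(len(positions)), 2):
        if r_min <= abs(positions[i] - positions[j]) < r_max:
            ww += (weights[i] / mean_w) * (weights[j] / mean_w)
            dd += 1
    return ww / dd


# Hypothetical toy catalogue: tight clumps (overdense) plus isolated objects.
positions, one_plus_delta = [], []
for centre in (100.0, 200.0, 300.0, 400.0):
    for k in range(5):
        positions.append(centre + 0.1 * k)
        one_plus_delta.append(4.0)   # assumed overdensity of clump members
for k in range(20):
    positions.append(float(k))
    one_plus_delta.append(0.5)       # assumed overdensity of isolated objects
lum = [1.0 + (i % 3) for i in range(len(positions))]  # arbitrary luminosities


def signal(alpha):
    """Small-scale WW/DD with the modified weight w = L (1 + delta_8)^alpha."""
    w = [L * d ** alpha for L, d in zip(lum, one_plus_delta)]
    return mark_correlation(positions, w, 0.0, 0.5)


base = signal(0.0)
excess_small = signal(0.01) / base - 1.0
excess_large = signal(0.05) / base - 1.0
# If the added signal is proportional to alpha, this ratio is close to 5.
print(excess_large / excess_small)
```

In this toy case the ratio of the two excess signals stays within a few per cent of 0.05/0.01 = 5, with the small departure coming from terms of higher order in α.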



APPENDIX B: SMALL-SCALE 
ENVIRONMENT AS A WEIGHT 

B1 A Toy Model

We will use a simple toy model to illustrate the effect of 
using the environment as a weight. 

Suppose that all the mass is in haloes which all have the 
same mass m distributed around each halo centre according 
to 

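$$ \frac{\rho(r)}{\bar\rho} = \frac{m}{\bar\rho} \, \frac{\exp(-r^2/2R_v^2)}{(2\pi R_v^2)^{3/2}} \equiv \Delta_v \exp(-r^2/2R_v^2), \tag{B1} $$

where the final expression defines Δ_v, the central density
(because (2π)^{3/2} R_v^3 is the volume of the profile).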
Then the unweighted correlation function is 



$$ \xi(r) = \frac{\bar\rho}{m} \, \frac{m^2}{\bar\rho^2} \, \frac{\exp(-r^2/4R_v^2)}{(2\pi \, 2R_v^2)^{3/2}} = \frac{\Delta_v}{2^{3/2}} \, e^{-r^2/4R_v^2}, \tag{B2} $$



where the first factor of ρ̄/m is the number density of haloes
(i.e., all the mass is in haloes of mass m). 

If we model the weight as the local value of the density 
smoothed with a fixed aperture of scale s, then the weight 
associated with a distance r from the halo centre is 

$$ w(r) = \exp\left[ -\frac{r^2}{2 (R_v^2 + s^2)} \right] . \tag{B3} $$

If we define

$$ R_s^2 = R_v^2 \, \frac{R_v^2 + s^2}{2R_v^2 + s^2} , \tag{B4} $$

then the mean weight is

$$ \bar{w} = 4\pi \int dr \, r^2 \rho(r) w(r) / m = (R_s/R_v)^3 . \tag{B5} $$



Therefore, the normalized weighted correlation function is 

$$ \xi_w(r) = \frac{\bar\rho}{m} \, \frac{m^2}{\bar\rho^2} \, \frac{e^{-r^2/4R_s^2}}{(2\pi \, 2R_s^2)^{3/2}} . \tag{B6} $$



To see what this implies for WW/DD, suppose that
s ≪ R_v. Then R_s^2 → R_v^2/2. On small scales WW/DD
= (1 + ξ_w)/(1 + ξ) ≈ ξ_w/ξ because we are interested
in the case in which Δ_v ≫ 1. In this limit, WW/DD
≈ 2^{3/2} exp(−r^2/2R_v^2)/exp(−r^2/4R_v^2). This shows that
WW/DD has the same shape as ξ itself, and the small-scale
amplitude is 2^{3/2} times the unweighted one. The amplitude
is 2^{3/2} because the assumed Gaussian profile (Eqn. B1) is
relatively flat; a centrally cusped profile [such as a Navarro,
Frenk, & White (1996) one] will produce a stronger signal.
The small-scale shape of WW/DD should not come as a
surprise: when s ≪ R_v then ξ_w is like the convolution of
ρ^2 with itself, making ξ_w ∝ ξ^2. More generally, WW/DD
∝ ξ^{1/(1 + s^2/R_v^2)}.
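
The mean weight of Eqn. B5 and the general scaling of WW/DD with ξ can be verified numerically. The following is a sketch under the toy model's assumptions (Gaussian profile, Gaussian aperture, Δ_v ≫ 1); the function names, the midpoint-rule integration, and the chosen parameter values are illustrative assumptions.

```python
import math


def mean_weight_numeric(r_v, s, n_steps=4000):
    """Midpoint-rule estimate of 4*pi * int dr r^2 rho(r) w(r) / m."""
    r_max = 12.0 * r_v  # the integrand is negligible beyond this radius
    dr = r_max / n_steps
    total = 0.0
    for k in range(n_steps):
        r = (k + 0.5) * dr
        # rho(r)/m for the Gaussian profile of Eqn. B1
        profile = math.exp(-r * r / (2 * r_v**2)) / (2 * math.pi * r_v**2) ** 1.5
        # smoothed-density weight of Eqn. B3
        weight = math.exp(-r * r / (2 * (r_v**2 + s**2)))
        total += 4 * math.pi * r * r * profile * weight * dr
    return total


def r_s_squared(r_v, s):
    """R_s^2 as defined in Eqn. B4."""
    return r_v**2 * (r_v**2 + s**2) / (2 * r_v**2 + s**2)


def mean_weight_analytic(r_v, s):
    """Closed form (R_s/R_v)^3 of Eqn. B5."""
    return (r_s_squared(r_v, s) / r_v**2) ** 1.5


def ww_over_dd(r, r_v, s):
    """xi_w / xi from Eqns. B2 and B6, i.e. WW/DD when Delta_v >> 1."""
    rs2 = r_s_squared(r_v, s)
    amplitude = (r_v**2 / rs2) ** 1.5
    return amplitude * math.exp(-r * r / (4 * rs2)) / math.exp(-r * r / (4 * r_v**2))


print(mean_weight_numeric(1.0, 0.5), mean_weight_analytic(1.0, 0.5))
```

For any s, `ww_over_dd` falls off as exp(−r²/4R_v²) raised to the power 1/(1 + s²/R_v²), which is the general scaling stated above; for s → 0 its amplitude tends to 2^{3/2}.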



B2 A Model in terms of Cluster and Field 
Populations 

To gain intuition about the effect of rank ordering, first 
note that for a list of length N marks, the mean mark is 
[N(N+l)/2]/N = (N+l)/2, so normalizing is particularly 
simple. Now, suppose the distribution of environments were 
bimodal, with one population associated with 'close' pairs 
(separations less than some R_c) in dense regions, and
another with underdense ones. Suppose that 'cluster' galaxies
have close neighbours but 'field' galaxies do not (i.e. they 
are like hard spheres). Then the total clustering signal is 
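$n_t^2 (1 + \xi_t) = n_c^2 (1 + \xi_{cc}) + n_f^2 (1 + \xi_{ff}) + 2 n_c n_f (1 + \xi_{cf})$, where
n_t = n_c + n_f and the mean mark is w̄ = (n_c w_c + n_f w_f)/n_t.
Therefore, with DD = 1 + ξ_t,

$$ WW = \frac{n_c^2 w_c^2 (1 + \xi_{cc}) + n_f^2 w_f^2 (1 + \xi_{ff}) + 2 n_c w_c n_f w_f (1 + \xi_{cf})}{n_t^2 \bar{w}^2} . \tag{B7} $$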

If we have rank ordered the marks, then n_t w̄ = N(N + 1)/2,
n_f w_f = N_f(N_f + 1)/2, and n_c w_c = n_t w̄ − n_f w_f. On scales
smaller than R_c we know that both ξ_ff and ξ_cf equal −1,
making

$$ \frac{WW}{DD} = \frac{n_c^2 w_c^2 (1 + \xi_{cc})}{n_c^2 \bar{w}^2 (1 + \xi_{cc})} = \left( \frac{w_c}{\bar{w}} \right)^2 = [1 + N_f/(N + 1)]^2 . \tag{B8} $$



Thus, the small scale signal is a measure of the field fraction 
Nf/N, but notice that it cannot exceed 4. 
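
A short numerical check of Eqn. B8, and of its inversion for the field fraction, is sketched below. It assumes, as in the toy model, that the field galaxies carry the N_f lowest ranks; the function names and the example numbers are hypothetical.

```python
import math


def small_scale_signal(n_field, n_total):
    """WW/DD below R_c when ranks 1..n_field belong to the field galaxies."""
    mean_rank = (n_total + 1) / 2
    n_cluster = n_total - n_field
    # Sum of the ranks n_field+1 .. n_total held by the cluster galaxies.
    cluster_rank_sum = n_total * (n_total + 1) / 2 - n_field * (n_field + 1) / 2
    mean_cluster_rank = cluster_rank_sum / n_cluster
    return (mean_cluster_rank / mean_rank) ** 2


def field_fraction_from_signal(ww_over_dd):
    """Invert (1 + f)^2 = WW/DD for f = N_f/N, valid in the large-N limit."""
    return math.sqrt(ww_over_dd) - 1.0


print(small_scale_signal(700, 1000))    # equals (1 + 700/1001)^2
print(field_fraction_from_signal(3.0))  # about 0.73
```

The first function builds the signal directly from the rank sums and reproduces the closed form of Eqn. B8; the second shows that a small-scale signal of 3 would correspond to a field fraction of roughly 0.73 in this model.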

If we were to interpret our measured value of
WW/DD ≈ 3 in Figure 9 in these terms, we would infer a
field fraction of about 70%; it is interesting that this implies
a cluster fraction (30%) that is close to the satellite fraction
usually quoted in halo model analyses of galaxy clustering
(e.g. Zehavi et al. 2005; van den Bosch et al. 2007) and the
satellite fraction of the mock catalogue used in this paper
(Sec. 2.1). If we assume that on intermediate scales ξ_ff and
ξ_cf are both approximately equal to zero, then

$$ \frac{WW}{DD} = \frac{1 + (n_c/n_t)^2 (w_c/\bar{w})^2 \xi_{cc}}{1 + (n_c/n_t)^2 \xi_{cc}} . $$

In this approximation, the scale dependence of WW/DD 
codes information about the cluster or field fraction, and the 
correlation function of the cluster population. In the ξ(r) ≫
1 limit, it smoothly asymptotes to the previous expression.
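
The two limits of the intermediate-scale expression can be made explicit with a small sketch. The function name and the parameter values (a cluster fraction of 0.3 and a weight ratio w_c/w̄ of 1.7) are illustrative assumptions.

```python
def ww_over_dd_intermediate(cluster_fraction, weight_ratio, xi_cc):
    """WW/DD when xi_ff and xi_cf are approximately zero.

    cluster_fraction is n_c/n_t and weight_ratio is w_c divided by the
    mean weight.
    """
    f2 = cluster_fraction ** 2
    return (1.0 + f2 * weight_ratio**2 * xi_cc) / (1.0 + f2 * xi_cc)


# Signal for weakly, moderately, and strongly clustered cluster populations.
for xi_cc in (0.0, 1.0, 10.0, 1e9):
    print(xi_cc, ww_over_dd_intermediate(0.3, 1.7, xi_cc))
```

With no clustering of the cluster population the signal is unity, and for very strong clustering it approaches (w_c/w̄)², the small-scale value of Eqn. B8.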



